MWE-sensitive Word Alignment in Factored Translation Model
Authors
Abstract
The factored translation model in Moses (Koehn et al. 2007; Avramidis and Koehn 2008; Koehn 2010), which consists of translation processes followed by a generation process, is intended to handle morphologically rich languages by integrating additional linguistic markup at the word level. Each type of additional word-level information is called a factor, with the independence assumptions shown in (1):
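Independently of equation (1), the notion of a factor can be illustrated with a minimal sketch. Moses represents a factored token as fields joined by `|`; the three factor names used here (`surface`, `lemma`, `pos`), the helper names, and the sample sentence are assumptions chosen for illustration and do not come from the paper.

```python
from typing import List, NamedTuple


class FactoredToken(NamedTuple):
    """One word with its word-level factors (factor names are hypothetical)."""
    surface: str
    lemma: str
    pos: str


def parse_factored_sentence(line: str, sep: str = "|") -> List[FactoredToken]:
    """Split a whitespace-tokenized line whose tokens look like surface|lemma|pos."""
    tokens = []
    for raw in line.split():
        parts = raw.split(sep)
        if len(parts) != 3:
            raise ValueError(f"expected 3 factors in {raw!r}, got {len(parts)}")
        tokens.append(FactoredToken(*parts))
    return tokens


def project(tokens: List[FactoredToken], factor: str) -> List[str]:
    """Return a single factor stream, e.g. all lemmas, from a factored sentence."""
    return [getattr(token, factor) for token in tokens]


# Hypothetical example sentence with three factors per token.
sentence = parse_factored_sentence("the|the|DT houses|house|NNS")
lemmas = project(sentence, "lemma")
tags = project(sentence, "pos")
```

In a factored setup, separate translation steps can operate on streams such as `lemmas` and `tags`, and a generation step then produces surface forms on the target side; the sketch covers only the data representation, not those steps.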
Similar resources
Given Bilingual Terminology in Statistical Machine Translation: MWE-Sensitive Word Alignment and Hierarchical Pitman-Yor Process-Based Translation Model Smoothing
This paper considers a scenario in which we are given almost perfect knowledge about bilingual terminology in terms of a test corpus in Statistical Machine Translation (SMT). When the given terminology is part of a training corpus, one natural strategy in SMT is to use the trained translation model, ignoring the given terminology. Two questions then arise. 1) Can a word aligner capture the gi...
MWE Alignment in Phrase Based Statistical Machine Translation
Multiword Expressions (MWEs) contribute to major lexical ambiguity problems in any language and pose a big challenge in statistical machine translation. This paper presents the role of MWEs in improving the performance of a phrase-based Statistical Machine Translation (PB-SMT) system. We preprocess the parallel corpus by single tokenizing the MWEs on both sides which leads to significant improve...
Statistical Approach With Factored Translation Models For Indian Languages
Factored translation models are an extension to phrase based statistical translation models which integrate additional annotation at word level. Here we present a study of statistical models and approaches to translate Hindi to English. Experiments were also conducted on alignment models using various word groupings and using GIZA++ to predict their English translations and fertility. TAJ A new...
Discriminative Modeling of Extraction Sets for Machine Translation
We present a discriminative model that directly predicts which set of phrasal translation rules should be extracted from a sentence pair. Our model scores extraction sets: nested collections of all the overlapping phrase pairs consistent with an underlying word alignment. Extraction set models provide two principal advantages over word-factored alignment models. First, we can incorporate featur...
Exploiting Translational Correspondences for Pattern-Independent MWE Identification
Based on a study of verb translations in the Europarl corpus, we argue that a wide range of MWE patterns can be identified in translations that exhibit a correspondence between a single lexical item in the source language and a group of lexical items in the target language. We show that these correspondences can be reliably detected on dependency-parsed, word-aligned sentences. We propose an ex...
Journal title:
Volume, issue
Pages: -
Publication date: 2010